Speech Emotion Recognition Using Amplitude Modulation Parameters
Authors
Abstract
In the Human-Computer Interface (HCI) community, researchers have been working for several years to emulate the human communication system, using innovative technologies and methodologies based on emotion recognition from facial expressions and speech [1-3]. Speech emotion recognition (SER) [4] is a challenging task in demanding human-machine interaction systems. Standard approaches based on the categorical model of emotions reach low performance, probably because they model emotions as distinct and independent affective states. Starting from the recently investigated assumption of the dimensional circumplex model of emotions [5,6], SER systems are structured as the prediction of valence and arousal on a continuous scale in a two-dimensional domain. In this study, we propose the use of a PLS regression model, optimized according to specific feature selection procedures and trained on the Italian speech corpus EMOVO, and we suggest a way to automatically label the corpus in terms of arousal and valence. To label the dataset according to valence and arousal, the circumplex diagram was divided into angular sectors, each having a specific angular range and a central angle a_k, as shown in Fig. 1. Each emotion was also identified by the coordinates (y_{v,e}, y_{a,e}), e = 1, ..., N_e, with N_e the number of emotions.
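As a rough illustration of this labelling scheme, the sketch below assigns each EMOVO emotion class an assumed central angle a_k on the circumplex, derives continuous (valence, arousal) targets as radius * (cos a_k, sin a_k), and fits a scikit-learn PLSRegression model. The angles, the radius, the number of latent components and the random feature matrix X are placeholders for illustration, not the values or features used in the paper.

import numpy as np
from sklearn.cross_decomposition import PLSRegression

# Assumed central angle a_k (degrees) per EMOVO emotion class;
# the paper's actual sector layout (Fig. 1) may differ.
central_angle_deg = {
    "joy": 45.0, "surprise": 80.0, "anger": 135.0, "fear": 160.0,
    "disgust": 200.0, "sadness": 225.0, "neutral": 0.0,
}

def circumplex_targets(labels, radius=1.0):
    """Map categorical labels to (valence, arousal) = radius * (cos a_k, sin a_k)."""
    a = np.radians([central_angle_deg[lab] for lab in labels])
    return np.stack([radius * np.cos(a), radius * np.sin(a)], axis=1)

# Placeholder acoustic feature matrix X (n_utterances x n_features) and labels.
rng = np.random.default_rng(0)
labels = list(central_angle_deg) * 10            # 70 toy utterances
X = rng.normal(size=(len(labels), 24))
Y = circumplex_targets(labels)                   # continuous valence/arousal targets

pls = PLSRegression(n_components=6)              # components would be tuned by validation
pls.fit(X, Y)
valence_arousal = pls.predict(X)                 # (n_utterances, 2) predictions on the plane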
Similar resources
Amplitude modulation features for emotion recognition from speech
The goal of speech emotion recognition (SER) is to identify the emotional or physical state of a human being from his or her voice. One of the most important steps in a SER task is to extract and select relevant speech features with which most emotions can be recognized. In this paper, we present a smoothed nonlinear energy operator (SNEO)-based amplitude modulation cepstral coefficients (AM...
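The SNEO mentioned here is a smoothed form of the Teager-Kaiser energy operator. The snippet below is a minimal sketch of an SNEO envelope for one frequency band, assuming a Butterworth band-pass filter and a Hamming smoothing window; the paper's actual filterbank and the cepstral stage of the AM cepstral coefficients are not reproduced here.

import numpy as np
from scipy.signal import butter, sosfiltfilt, get_window

def teager_energy(x):
    """Teager-Kaiser energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    psi = np.zeros_like(x)
    psi[1:-1] = x[1:-1] ** 2 - x[:-2] * x[2:]
    return psi

def sneo_envelope(x, sr, band, smooth_ms=20.0):
    """Band-pass the signal, apply the TEO, then smooth the result (SNEO);
    the output roughly tracks the squared AM envelope of that band."""
    sos = butter(4, band, btype="bandpass", fs=sr, output="sos")
    xb = sosfiltfilt(sos, x)
    win = get_window("hamming", int(sr * smooth_ms / 1000.0))
    return np.convolve(teager_energy(xb), win / win.sum(), mode="same")

# Toy usage: a 200 Hz tone amplitude-modulated at 4 Hz.
sr = 16000
t = np.arange(sr) / sr
x = (1 + 0.5 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 200 * t)
env = sneo_envelope(x, sr, band=(100.0, 300.0))   # the slow 4 Hz modulation shows up here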
A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation
Recent developments in robotics and automation have motivated researchers to improve the efficiency of interactive systems by enabling natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from the speech signal has become a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...
Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions
Automatic recognition of emotional states from speech in noisy conditions has become an important research topic in the emotional speech recognition area in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...
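For orientation, the sketch below computes PNCC-like coefficients in a heavily simplified way, assuming a mel filterbank (via librosa) followed by the characteristic power-law nonlinearity (exponent roughly 1/15) and a DCT. The gammatone filterbank, medium-time power bias subtraction and temporal masking of the full PNCC algorithm are omitted, and all frame parameters are assumptions.

import numpy as np
import librosa
from scipy.fftpack import dct

def pncc_like(y, sr, n_mels=40, n_ceps=13):
    """Filterbank power -> power-law compression (~1/15) -> DCT."""
    power_spec = librosa.feature.melspectrogram(
        y=y, sr=sr, n_fft=512, hop_length=160, n_mels=n_mels, power=2.0)
    compressed = power_spec ** (1.0 / 15.0)       # power-law nonlinearity
    return dct(compressed, type=2, axis=0, norm="ortho")[:n_ceps]

# Toy usage on one second of noise (any mono waveform works).
sr = 16000
y = np.random.default_rng(0).normal(size=sr).astype(np.float32)
features = pncc_like(y, sr)                       # shape: (n_ceps, n_frames)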
Modulation Spectral Features for Predicting Vocal Emotion Recognition by Simulated Cochlear Implants
It has been reported that vocal emotion recognition is challenging for cochlear implant (CI) listeners due to the limited spectral cues provided by CI devices. Because of the CI processing mechanism, modulation information is provided as a primary cue. Previous studies have revealed that the modulation components of speech are important for speech intelligibility. However, it is unclear whether modulation informati...
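As a generic illustration of the modulation information referred to here (not the paper's feature set or its CI simulation), the sketch below band-pass filters a signal, extracts the temporal envelope of that band with the Hilbert transform, and takes the spectrum of the envelope to obtain a modulation-frequency representation; the band limits and envelope sampling rate are assumed values.

import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def modulation_spectrum(x, sr, band=(300.0, 600.0), env_sr=100):
    """Return (modulation frequencies, magnitude spectrum) of one band's envelope."""
    sos = butter(4, band, btype="bandpass", fs=sr, output="sos")
    envelope = np.abs(hilbert(sosfiltfilt(sos, x)))      # temporal envelope of the band
    envelope = envelope[:: sr // env_sr]                 # downsample the envelope
    spec = np.abs(np.fft.rfft(envelope - envelope.mean()))
    freqs = np.fft.rfftfreq(envelope.size, d=1.0 / env_sr)
    return freqs, spec

# Toy usage: a 440 Hz carrier modulated at 4 Hz gives a peak near 4 Hz.
sr = 16000
t = np.arange(2 * sr) / sr
x = (1 + 0.8 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 440 * t)
freqs, spec = modulation_spectrum(x, sr)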
Classification of emotional speech using spectral pattern features
Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interactions. The aim of a SER system is to recognize human emotion by analyzing the acoustics of speech sound. In this study, we propose Spectral Pattern features (SPs) and Harmonic Energy features (HEs) for emotion recognition. These features extracted from the spectrogram ...
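The exact definitions of the SPs and HEs are not given in this snippet, so the sketch below only illustrates one plausible harmonic-energy-style computation from a spectrogram: estimate F0 per frame with librosa's YIN implementation and read the spectrogram energy at the bins closest to the first few harmonics. The function name, parameter values and synthetic test signal are all assumptions for illustration.

import numpy as np
import librosa

def harmonic_energy(y, sr, n_harmonics=5, n_fft=1024, hop=256):
    """Per-frame energy of the spectrogram bins nearest to the first harmonics of F0."""
    S = np.abs(librosa.stft(y, n_fft=n_fft, hop_length=hop)) ** 2
    freqs = librosa.fft_frequencies(sr=sr, n_fft=n_fft)
    f0 = librosa.yin(y, fmin=60, fmax=400, sr=sr,
                     frame_length=n_fft, hop_length=hop)
    n_frames = min(S.shape[1], f0.shape[0])
    he = np.zeros((n_harmonics, n_frames))
    for t in range(n_frames):
        for h in range(1, n_harmonics + 1):
            k = np.argmin(np.abs(freqs - h * f0[t]))   # bin closest to harmonic h
            he[h - 1, t] = S[k, t]
    return he                                          # (n_harmonics, n_frames)

# Toy usage: a synthetic 200 Hz voiced-like tone.
sr = 16000
t = np.arange(sr) / sr
y = np.sin(2 * np.pi * 200 * t).astype(np.float32)
features = harmonic_energy(y, sr)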